CMSC 408 Deliverable 8 Database Design
Financial Data Manager Project
Project Overview and Key Deliverables
The Financial Data Manager project is a project created by two Computer Science majors also pursuing minors in General Business. The purpose of this project is to aid investors in making the best investment choices in the stock market. Our database provides the necessary financial data that an investor can find highly useful to evaluate a company’s financial health and wealth, and we believe that a database will not only reduce time, but also get future investors a chance to invest utilizing a wise tool to aid them :).
URL’s
Problem Description
Problem domain: Creating a database with this data would address a problem faced by investors and financial analysts. Companies can have different means of reporting which makes it inefficient and time-consuming to look at. Specifically, finding and retrieving the exact information is a waste of time for investors who want to get their data as soon as possible to make investment decisions, especially with the fast-paced and changing nature of the markets. For example, I had to look for information on multiple website wasting around 3-4 hours of time just searching for the most accurate information in order to make stock pitches for the Student Managed Investment Portfolio (SMIP) which is an investment club at VCU.
Need: A database is needed to store and organize financial information and is useful for quicker decision-making processes and better management of all the existing data.
Context, scope, and perspective: The database is useful for investors and for those who want to get experience reading financial statements to make investment decisions. It takes a long time to look for accurate information about a company and we thought it would be better to have all that important information in one place for users to access that data. For example, if I wanted to decide whether or not to invest in Nvidia, I would first take a look at financial statements to see the financial health of the company to see how they’re doing and then decide whether or not the stock would be a buy, sell, or hold.
User roles and use cases: The consumer target is primarily investors in the stock market whether it is a millionaire or a novice who is just getting started. These are the people who will mostly be using our database to fetch financial data for a company they wish to know more about. For example, if an investor wants to get the earningsfor a company like Apple, they can use the database to search the company and find the earnings at a specific point of time. If an investor decides whether or not buying Apple stock would be a good decision, they can use earnings as one of the indicators to make that decision utilizing the database with the data we provide. The database will be use an API that will fetch data from a trusted source to store it in the database.
Security and Privacy: The only security concern that we could think of is whether or not the information from the source we get it from is accurate. Another issue could be that in case there is a data breach and all information is lost, then it will be harder for our customers to gain access to essential information.
Database Design
- Company: represents a business organization and has a unique ID.
- Stock: represents stock information and has a unique stock ticker.
- Income_Statement: represents financial performance and has a unique statement ID.
- Balance_Sheet: represents a company’s financial position and has a unique statement ID.
- Cash_Flow: represents cash inflows and outflows and has a unique statement ID.
- Earnings: represents financial earnings and has a unique earnings ID.
Entity-relationship diagrams: (Note: We made the attributes shorter to make the diagram more readable.)
erDiagram
COMPANY {
string ticker
string name
string sector
string industry
}
INCOME_STATEMENT {
int statement_id1
string ticker
date fiscal_year_end
float total_revenue
float cost_of_revenue
float gross_profit
float operating_income
float Net_income_common_stockholders
float income_before_tax
float ebit
float ebitda
}
STOCK {
string ticker
int company_id
float price
date IPO_date
}
BALANCE_SHEET {
string ticker
int statement_id2
date fiscal_year_end
float total_assets
float current_assets
float cash
float cash_and_cash_equivalents
float accounts_receivables
float total_debt
float net_tangible_assets
float long_term_debt
float working_capital
float invested_capital
float tangible_book_value
float total_capitalization
int shares_issued
float stockholders_equity
float retained_earnings
float common_stock_equity
}
CASH_FLOW {
int statement_id3
string ticker
date fiscal_year_end
float operating_cashflow
float capital_expenditures
float free_cash_flow
float cash_dividends_paid
}
EARNINGS {
int earnings_id
string ticker
int earnings_id
int company_id
date fiscal_date_end
float reported_eps
}
COMPANY ||--o{ INCOME_STATEMENT : publishes
COMPANY ||--o{ STOCK : issues
COMPANY ||--o{ BALANCE_SHEET : records
COMPANY ||--o{ CASH_FLOW : flows
COMPANY ||--o{ EARNINGS : reports
Relational Schemas
Note that PK is (Primary Key) and FK is (Foreign Key)
- Company(ticker (PK, string), name (string), sector (string), industry (string))
- This relation simply associates a company with the name, industry, and ticker through the company ID.
- Stock(companyID (int), ticker (FK, string), price (float), IPO_date (date))
- This relation associates a stock with the price, IPO_date, companyID through the stockID.
- Income_Statement(statement_id (PK, int), fiscal_year_end (date), total_revenue (float), cost_of_revenue (float), gross_profit (float), operating_income (float), Net_income_common_stockholders (float), pretax_income (float), ebit (float), ebitda (float), ticker (FK, string))
- This relation stores details of a income statement for a company.
- Balance_Sheet(statement_id (PK, int), ticker (FK, string), fiscal_year_end (date), total_assets (float), current_assets (float), cash_and_cash_equivalents (float), accounts_receivable (float), total_debt (float), long_term_debt (float), net_tangible_assets (float), working_capital (float), invested_capital (float), tangible_book_value (float), total_capitalization (float), shares_issued (int), stockholders_equity (float), retained_earnings (float), common_stock_equity (float))
- This relation stores details of a balance sheet for a company.
- Cash_Flow(statement_id (PK, int), ticker (FK, string), fiscal_year_end (date), operating_cashflow (float), capital_expenditures (float), free_cash_flow (float), cash_dividends (float))
- This relation stores the cash flow data for a company.
- Earnings(earnings_id (PK, int), ticker (FK, string), fiscal_date_end (date), reported_eps (float))
- This relation stores earnings data for a company like reported earnings per share (EPS).
Functional Dependencies and Normalization
Functional Dependencies
- Company: PK - ticker, FDs ticker → name, sector, industry
- Income_Statement: PK - statementID, FDs statementID → fiscal_year_end, total_revenue, cost_of_revenue, gross_profit, operating_income, Net_income_common_stockholders, income_before_tax, ebit, ebitda, ticker
- Stock: PK - company_id, FDs - company_id → eod_price, IPO_date, ticker
- Balance_Sheet: PK - statement_id, FDs statement_id → ticker, fiscal_year_end, total_assets, current_assets, cash_and_cash_equivalents, accounts_receivable, total_debt, long_term_debt, net_tangible_assets, working_capital, invested_capital, tangible_book_value, total_capitalization, shares_issued, stockholders_equity, retained_earnings, common_stock_equity
- Cash_Flow: PK - statement_id, FDs statement_id → ticker, fiscal_year_end, operating_cashflow, capital_expenditures, free_cash_flow, cash_dividends
- Earnings: PK - earnings_id, FDs earnings_id → ticker, fiscal_date_end, reported_eps
Normalization
- Company: This relation is already in BCNF. There is one candidate key.
- Income_Statement: This relation is already in BCNF. There is one candidate key.
- Stock: This relation is already in BCNF. There is one candidate key.
- Balance_Sheet: This relation is already in BCNF. There is one candidate key.
- Cash_Flow: This relation is already in BCNF. There is one candidate key.
- Earnings: This relation is already in BCNF. There is one candidate key.
Specific Queries
- List all companies and their stock tickers.
Relational Algebra:
\pi_{\text{Name, Ticker}}(\text{Company})
- Find all companies from the Technology sector.
Relational Algebra:
\sigma_{\text{Sector='Technology'}}(\text{Company})
- Get the stock ticker and price for Apple.
Relational Algebra:
\sigma_{\text{Stock\_ticker='AAPL'}}(\text{Stock})
- Find the companies with IPO dates after 2010
Relational Algebra:
\pi_{\text{Name}}(\sigma_{\text{IPO\_date} > '2010-01-01'}(\text{Company}))
- Retrieve the total revenue and net income for Apple
Relational Algebra:
\pi_{\text{total\_revenue}, \text{net\_income}}(\sigma_{\text{Stock\_ticker} = 'AAPL'}(\text{Income\_Statement}))
- Get the total debt for all companies in the Consumer Cyclical sector with fiscal year end
Relational Algebra:
\pi_{\text{Name}, \text{total\_debt}}(\sigma_{\text{sector} = 'Consumer Cyclical'}(\text{Balance\_Sheet}))
- List all companies with a gross profit greater than 50 million
Relational Algebra:
\pi_{\text{Name}}(\sigma_{\text{gross\_profit} > 50000000}(\text{Income\_Statement}))
- Find the companies with total assets greater than 1 billion
Relational Algebra:
\pi_{\text{Name}}(\sigma_{\text{total\_assets} > 1000000000}(\text{Balance\_Sheet}))
- Get the fiscal year-end date for Microsoft from 2021 to 2024
Relational Algebra:
\pi_{\text{fiscal\_year\_end}}(\sigma_{\text{Name} = 'Microsoft'}(\text{Stock} \bowtie \text{Company}))
- Find all companies that have a working capital greater than 50 million
Relational Algebra:
\pi_{\text{Name}}(\sigma_{\text{working\_capital} > 50000000}(\text{Balance\_Sheet}))
- Get the names of companies in the ‘Software Infrastructure’ industry
Relational Algebra:
\pi_{\text{Name}}(\sigma_{\text{industry} = 'Software Infrastructure'}(\text{Company}))
- Retrieve the total revenue and cost of revenue for Tesla
Relational Algebra:
\pi_{\text{total\_revenue}, \text{cost\_of\_revenue}}(\sigma_{\text{Name} = 'Tesla'}(\text{Income\_Statement}))
- Find companies that have both net income for common stockholders greater than 10 million.
Relational Algebra:
\pi_{\text{Name}}(\sigma_{\text{net\_income} > 10000000}(\text{Income\_Statement}))
- Retrieve the operating cashflow for Nvidia.
Relational Algebra:
\pi_{\text{Operating\_cashflow}}(\sigma_{\text{Stock\_ticker} = \text{'NVDA'}}(\text{Cashflow}))
- Retreive the earnings for all tickers in the fiscal date end of 2022-12-31
Relational Algebra:
\pi_{\text{Ticker}, \text{Earnings}}(\sigma_{\text{Fiscal\_year\_end} = \text{'2022-12-31'}}(\text{Income\_Statement} \bowtie \text{Ticker}))
- Get the capital expenditures for companies which are greater than -2000000
Relational Algebra:
\pi_{\text{Name}, \text{Capital\_expenditures}}(\sigma_{\text{Capital\_expenditures} > -2000000}(\text{Cashflow}))
- Retrieve the stock ticker with fiscal year and cash dividends paid which are not null
Relational Algebra:
\pi_{\text{Stock\_ticker}, \text{Fiscal\_year\_end}, \text{Cash\_dividends\_paid}}(\sigma_{\text{Cash\_dividends\_paid} \neq \text{NULL}}(\text{Cash\_Flow}))
- Retrieve the stock ticker and free cash flow where free cash flow is greater than 50,000,000
Relational Algebra:
\pi_{\text{Stock\_ticker}, \text{Free\_cash\_flow}}(\sigma_{\text{Free\_cash\_flow} > 50000000}(\text{Cash\_Flow}))
- Retrieve the fiscal year and operating cash flow for records where operating cash flow is less than 10,000,000.
Relational Algebra:
\pi_{\text{Fiscal\_year\_end}, \text{Operating\_cashflow}}(\sigma_{\text{Operating\_cashflow} < 10000000}(\text{Cash\_Flow}))
- Retrieve the stock ticker, fiscal year, and capital expenditures for records where capital expenditures are negative.
Relational Algebra:
\pi_{\text{Stock\_ticker}, \text{Fiscal\_year\_end}, \text{Capital\_expenditures}}(\sigma_{\text{Capital\_expenditures} < 0}(\text{Cash\_Flow}))
Sample Data
- Company
| name | sector | industry | ticker (PK) |
|---|---|---|---|
| Apple Inc. | Technology | Electronics | AAPL |
| Microsoft | Technology | Software | MSFT |
| Tesla | Automotive | Internet | TSLA |
| JPMorgan | Financial | Bank | JPM |
| Pfizer | Healthcare | Healthcare | PFE |
- Stock
| company_id | stock_ticker | price | IPO_date | ticker (FK) |
|---|---|---|---|---|
| 1 | 1001 | 180.50 | 2023-01-01 | AAPL |
| 2 | 1002 | 320.12 | 2023-02-01 | MSFT |
| 3 | 1003 | 250.75 | 2023-03-01 | TSLA |
| 4 | 1004 | 140.88 | 2023-04-01 | JPM |
| 5 | 1005 | 39.25 | 2023-05-01 | PFE |
- Income_Statement
| statement_id | fiscal_year_end | total_revenue | cost_of_revenue | gross_profit | operating_income | Net_income_cs | income_before_tax | ebit | ebitda | ticker (FK) |
|---|---|---|---|---|---|---|---|---|---|---|
| 2001 | 2023-12-31 | 400000000000 | 200000000000 | 200000000000 | 120000000000 | 95000000000 | 110000000000 | 120000000 | 150000000 | AAPL |
| 2002 | 2023-12-31 | 250000000000 | 80000000000 | 170000000000 | 90000000000 | 75000000000 | 85000000000 | 90000000 | 110000000 | MSFT |
| 2003 | 2023-12-31 | 100000000000 | 70000000000 | 30000000000 | 15000000000 | 12000000000 | 14000000000 | 15000000 | 20000000 | TSLA |
| 2004 | 2023-12-31 | 50000000000 | 30000000000 | 20000000000 | 10000000000 | 8000000000 | 9000000000 | 10000000 | 13000000 | JPM |
| 2005 | 2023-12-31 | 25000000000 | 15000000000 | 10000000000 | 6000000000 | 5000000000 | 5500000000 | 6000000 | 8000000 | PFE |
- Balance_Sheet
| statement_id | fiscal_year_end | total_assets | current_assets | cash | cash_equivalents | total_debt | shares_issued | current_debt | long_term_debt | retained_earnings | common_stock_equity | ticker (FK) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3001 | 2023-12-31 | 500000000000 | 200000000000 | 300000000000 | 50000000000 | 200000000 | 15000000000 | 20000000000 | 50000000000 | 250000000000 | 500000000 | AAPL |
| 3002 | 2023-12-31 | 400000000000 | 150000000000 | 250000000000 | 40000000000 | 150000000 | 14000000000 | 15000000000 | 40000000000 | 200000000000 | 400000000 | MSFT |
| 3003 | 2023-12-31 | 300000000000 | 120000000000 | 180000000000 | 30000000000 | 100000000 | 13000000000 | 10000000000 | 30000000000 | 150000000000 | 300000000 | TSLA |
| 3004 | 2023-12-31 | 200000000000 | 100000000000 | 100000000000 | 20000000000 | 80000000 | 12000000000 | 5000000000 | 20000000000 | 75000000000 | 200000000 | JPM |
| 3005 | 2023-12-31 | 150000000000 | 50000000000 | 100000000000 | 10000000000 | 50000000 | 11000000000 | 3000000000 | 10000000000 | 50000000000 | 100000000 | PFE |
- Cash_Flow
| statement_id | fiscal_year_end | operating_cashflow | capital_expenditures | free_cash_flow | cash_dividends | ticker (FK) |
|---|---|---|---|---|---|---|
| 4001 | 2023-12-31 | 120000000000 | 50000000000 | 30000000000 | 20000000000 | AAPL |
| 4002 | 2023-12-31 | 100000000000 | 40000000000 | 25000000000 | 15000000000 | MSFT |
| 4003 | 2023-12-31 | 80000000000 | 30000000000 | 20000000000 | 10000000000 | TSLA |
| 4004 | 2023-12-31 | 60000000000 | 20000000000 | 15000000000 | 8000000000 | JPM |
| 4005 | 2023-12-31 | 40000000000 | 10000000000 | 10000000000 | 5000000000 | PFE |
- Earnings
| earnings_id | fiscal_date_end | reported_eps | ticker (FK) |
|---|---|---|---|
| 5001 | 2023-12-31 | 6.20 | AAPL |
| 5002 | 2023-12-31 | 5.50 | MSFT |
| 5003 | 2023-12-31 | 4.80 | TSLA |
| 5004 | 2023-12-31 | 3.10 | JPM |
| 5005 | 2023-12-31 | 2.50 | PFE |
Project Management
Draft Project Schedule
gantt
title Project Timeline
dateFormat YYYY-MM-DD
axisFormat %b %d
todayMarker stroke-width:2px,stroke:red,opacity:0.5
section Planning
Deliverable 8 :done, des1, 2024-09-15, 2024-10-13
Website Creation :active, des2, 2024-10-13, 2024-10-20
SQL Database Implementation :active, des2, 2024-10-20, 2024-11-04
Query Writing :active, des2, 2024-11-04, 2024-12-02
Testing and Refinement :active, des2, 2024-12-02, 2024-12-09
section Milestones
Deliverable 8 :milestone, m1, 2024-10-13, 1d
Deliverable 9 :milestone, m2, 2024-10-20, 1d
Deliverable 12 :milestone, m3, 2024-12-09, 1d